Speech recognizer performance in car and home applications utilizing novel multiple microphone configurations

ABSTRACT

System speakers are switched to function as sound input transducers to improve recognizer performance and to support recognizer features. A crossbar switch is selectively activated, either manually or under software control, to allow system loudspeakers to function as sound input transducers that supplement the recognition system microphone or microphone array. Using loudspeakers as “microphones” improves speech recognition in noisy environments, thus attaining better recognition performance with little added system cost. The loudspeakers, positioned in physically separate locations, also provide spatial information that can be used to determine the location of the person speaking and thereby offer different functionality for different persons. Acoustic models are selected based on environmental and vehicle operating conditions and may be adapted dynamically using ambient information obtained using the loudspeakers as sound input transducers.

FIELD OF THE INVENTION

The present invention relates generally to speech recognition systems. More particularly, the invention relates to an improved recognizer system, useful in a variety of applications and with a variety of electronic systems that use loudspeakers to provide sound to the user. The invention advantageously switches the loudspeakers from their normal sound reproduction mode to a voice or sound input mode and the voice or sound signal so input is then processed to enhance recognizer performance and to support additional recognizer features.

BACKGROUND OF THE INVENTION

To deploy an automatic speech recognizer in an automobile, or at another location, one or more microphones may need to be installed. Using multiple microphones can improve recognition results in noisy environments, but the installation costs can be prohibitive, particularly where the recognition system is installed in a system that was not originally designed for that purpose. In automotive applications, speech recognition features are typically integrated into the audio system of the car, using a single microphone, or a microphone array, that has a single cable for connecting it to the audio system. In such case, the audio system includes an input port to which the microphone cable is connected. Thus, even when the audio system includes such a port, it can be cost prohibitive to retrofit such an audio system with a recognizer that takes advantage of additional microphones (i.e., microphones in addition to the microphone or microphone array that was engineered for the system).

Using multiple microphones helps with removing noise. It also helps when more than one person is speaking, as the recognizer may be able to select the desired speaker by utilizing spatial information. In a multiple microphone system, this would be done by properly combining the signals received from the multiple microphones to acquire the spatial information. In an automotive application, it could be useful to have a recognition system that responds to certain voice commands only when uttered by the vehicle driver. With a single microphone, it can be very difficult to determine whether the person uttering the command is the driver, as opposed to another vehicle passenger. With multiple microphones it is much easier to discriminate among the speakers, particularly if the microphones are scattered throughout the vehicle. However, with current technology there is no economical way to accomplish this.

Using multiple microphones can also be beneficial in other applications. A second exemplary application involves deployment of automatic speech recognition for control of home entertainment systems. As in the car application, multiple microphones can help to remove noise and to select the desired speaker. Additionally, in home applications multiple microphones can be further applied to help reduce the adverse effects of room reverberations upon speech recognition.

SUMMARY OF THE INVENTION

The present invention provides an improved speech recognition system that may be coupled to an audio system or audio/video system to add speech recognition features to those systems and improve recognition performance. The system employs a multi-channel signal processor and a signal switch. The switch is adapted for placement between the audio system or audio/video system and the associated loudspeakers. In one state, the switch connects the loudspeakers to the audio system, so that the audio signal content may be supplied to the speakers for playback in the usual fashion. When switched to a second state, the switch decouples the loudspeakers from the audio system and instead couples them to input channels (one channel per loudspeaker) of the multi-channel signal processor. A microphone is coupled to another input channel of the multi-channel signal processor. The signal processor may be configured to provide a number of different processing operations, such as noise removal operations and spatial speaker localization operations. The output of the multi-channel processor may be fed to a speech recognizer which in turn controls system functions within the audio system or audio/video system.

Another aspect of the invention involves the automatic inclusion of environmental conditions to achieve more accurate speech recognition in noisy environments, such as within automotive vehicles. Speech recognition from a moving vehicle can be severely degraded by the ambient noise. The ambient noise in a vehicle is typically a time-varying phenomenon and may emanate from a variety of different sources such as:

-   noise from the engine and the revolving mechanical parts of the vehicle, vibration noise from the surface contact of the wheels and roadway, the noise from air drawn into the vehicle through ducts or open windows, noise from passing/overtaking vehicles, clicks from turn indicators, etc. Each type of vehicle generates different noise frequency characteristics (e.g., BMW series generates wide-band noises and Volvo series generates narrow-band noises).

The improved recognizer system of the invention will automatically extract the environmental information through the available in-vehicle sensors, including the in-vehicle loudspeakers used as sound transducers as explained herein. The system processes this information to determine the type(s) of noise present in the ambient background and uses the processed information to select the optimal acoustic models for speech recognition. In addition, the ambient background information so obtained may be used to train different noise models for different noise conditions, as they are experienced during vehicle operation. The trained noise models may then be selected, based on current noise conditions, when recognition is performed.

Further areas of applicability of the present invention will become apparent from the detailed description provided hereinafter. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will become more fully understood from the detailed description and the accompanying drawings, wherein:

FIG. 1 is a block diagram illustrating a presently preferred embodiment of the improved recognition system;

FIG. 2 is a signal processing diagram illustrating the spectrum magnitude method applied by the signal processing system;

FIG. 3 is a detailed block diagram of an embodiment of the speech recognition system, illustrating how system control functions can be implemented through voiced commands;

FIG. 4 is a perspective view of an automobile cockpit, illustrating how the invention may be integrated into the audio system of the vehicle;

FIG. 5 is a diagrammatic view of an audio/video system, illustrating an embodiment of the recognition system suitable for control of home electronic components by voiced command; and

FIG. 6 is a block diagram of the inventive system for training and using noise models adapted to different noise conditions within the vehicle.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The following description of the preferred embodiment(s) is merely exemplary in nature and is in no way intended to limit the invention, its application, or uses.

FIG. 1 shows an exemplary embodiment of the improved speech recognizer system. In the illustrated embodiment the recognizer system is coupled to an audio system 10, which has a plurality of audio output ports, as at 12, through which audio output signals are provided. In the illustrated embodiment, the audio system 10 supplies audio signals to a set of front speakers 14 a and 14 b and a set of rear speakers 16 a and 16 b. The front and rear speakers each provide left channel and right channel information, respectively. In FIG. 1, a single line has been shown to represent the left-right stereo pair. This has been done to simplify the drawing. Those skilled in the art will appreciate that separate sets of conductors are normally used to supply the left and right channels, respectively.

The improved recognition system is incorporated into the audio system by provision of the crossbar switch 18. As illustrated, switch 18 has a plurality of input ports 20 to which the audio system 10 is coupled and a plurality of ports 22 to which the loudspeakers are coupled. The crossbar switch is further coupled through a signal processor input bus 24, which may include a plurality of signal lines that communicate with the multi-channel signal processor 26.

Crossbar switch 18 has two switching states. In a first state the ports 20 are coupled to the ports 22. In this first state the audio system 10 is thus coupled to the loudspeakers, thereby allowing audio signals to be routed to the loudspeakers for playback in the usual fashion.

Crossbar switch 18 has a second switching state that decouples the loudspeakers from the audio system and instead couples the loudspeakers to the signal processor 26. In this second switching state the loudspeakers function as sound input transducers (i.e., as microphone devices).

In one embodiment the crossbar switch couples and decouples all loudspeakers simultaneously. In that embodiment, all loudspeakers are switched between audio playback state and sound input transducer state simultaneously. In an alternate embodiment, crossbar switch 18 is capable of independent speaker channel switching. In this embodiment a selected speaker can be switched from audio playback to sound input mode while the remaining loudspeakers remain in playback mode. If desired, the crossbar switch can also be provided with signal attenuators to reduce the sound output volume of loudspeakers in the playback mode when one or more loudspeakers have been switched to the sound input mode.
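The two switching embodiments and the optional attenuators can be illustrated with a small state model. This is only a sketch under stated assumptions: the class name `CrossbarSwitch`, its method names, the speaker labels and the -12 dB attenuation figure are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field

PLAYBACK = "playback"
SOUND_INPUT = "sound_input"


@dataclass
class CrossbarSwitch:
    """Hypothetical software model of the per-speaker switching states."""
    speaker_ids: tuple
    attenuation_db: float = -12.0  # assumed value, not from the source
    modes: dict = field(init=False)

    def __post_init__(self):
        # Every loudspeaker starts in the normal playback state.
        self.modes = {sid: PLAYBACK for sid in self.speaker_ids}

    def switch_all(self, mode):
        # First embodiment: all loudspeakers change state together.
        for sid in self.modes:
            self.modes[sid] = mode

    def switch_one(self, speaker_id, mode):
        # Alternate embodiment: independent speaker channel switching.
        self.modes[speaker_id] = mode

    def playback_gain_db(self, speaker_id):
        # A speaker in sound input mode produces no playback output.
        if self.modes[speaker_id] != PLAYBACK:
            return None
        # Attenuate the remaining playback speakers while any speaker listens.
        listening = any(m == SOUND_INPUT for m in self.modes.values())
        return self.attenuation_db if listening else 0.0
```

For example, after `switch_one("16b", SOUND_INPUT)` the model reports the assumed attenuation for the three speakers that remain in playback mode.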

The signal processor 26 also includes an audio input to which a microphone 28 is coupled. Microphone 28 serves as the primary input for receiving voiced commands that are then processed by the speech recognizer 30. Signal processor 26 digitizes the audio input signals from microphone 28 and from the input channels 24 and then processes the resulting digital data to optimize it for use by the recognizer 30. Such optimization can include performing noise cancellation algorithms (discussed below) and speaker localization or source separation algorithms (also discussed below).

In the embodiment illustrated in FIG. 1, signal processor 26 also effects software control over the crossbar switch 18 via the control line 32 shown in dotted lines in FIG. 1. When a user utters a voiced command, the command is initially picked up by microphone 28. Signal processor 26, upon detecting speech input from microphone 28, sends a control signal to the crossbar switch, causing it to switch to the sound input mode for one or more of the loudspeakers 14 a, 14 b, 16 a, and 16 b. In this embodiment the signal processor thus automatically switches the system for improved recognizer performance based on receipt of a voice command through microphone 28.

This automatic operation can be accomplished in a variety of ways. One way uses the signal processor 26 to continually monitor the sound input level and other spectral characteristics of the input from microphone 28. The signal processor acquires information about the ambient background noise by averaging the input signal from microphone 28 over a predetermined time interval that is substantially longer than the voiced commands that the system is designed to recognize. The ambient background level is then subtracted out from the signal input from microphone 28, so that voiced command utterances are readily discriminated from the background ambient noise level.
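The long-window averaging and subtraction described above might be sketched as follows. The class name `AmbientLevelTracker`, the use of frame energy as the monitored level, the window length and the decision margin are all illustrative assumptions; the specification does not fix any of them.

```python
from collections import deque


class AmbientLevelTracker:
    """Hypothetical detector: long-term average energy minus current energy."""

    def __init__(self, window_frames=200, margin=2.0):
        # window_frames and margin are assumed tuning values.
        self.history = deque(maxlen=window_frames)
        self.margin = margin

    def ambient_level(self):
        # Average energy over the stored window (0.0 before any input).
        return sum(self.history) / len(self.history) if self.history else 0.0

    def is_speech(self, frame):
        energy = sum(x * x for x in frame) / len(frame)
        ambient = self.ambient_level()
        # Subtract the ambient background level from the current input level.
        residual = max(energy - ambient, 0.0)
        # No decision is made until some ambient history exists.
        detected = bool(self.history) and residual > self.margin * ambient
        # The window is much longer than a command, so every frame is averaged in.
        self.history.append(energy)
        return detected
```

A long window keeps a short command from pulling the ambient estimate up noticeably, which is the reason the averaging interval is chosen to be much longer than any expected utterance.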

If desired, the signal processor can also receive an audio signal through the input bus 24. This input signal can supply the signal processor with the audio signal being sent to the loudspeakers. By subtracting out this signal (which microphone 28 is picking up) the microphone can be made more sensitive to voiced commands.

An alternate processing technique relies upon recognizer 30 to recognize the voiced commands received through microphone 28 and initially processed by signal processor 26 without having information from the loudspeakers. In this alternate embodiment the recognizer can detect particular utterances, such as particular command words or phrases, and then send a control signal to signal processor 26, informing it that the crossbar switch 18 needs to be switched to the sound input mode. Thus, a particular voiced command by a user can be used to signal the system that it needs to switch to the sound input mode whereby one or more of the loudspeakers serve as auxiliary sound input transducers.

Another more sophisticated embodiment uses the confidence level achieved by the recognizer to determine when noise cancellation or other signal processing operations are needed. Upon detecting such conditions, the signal processor is notified via the control line 34 and it, in turn, signals the crossbar switch via line 32 to switch to the sound input state. This functionality may be implemented by monitoring the recognition score or probability of match score generated by the recognizer as it operates upon the input data. When recognition confidence drops below a predetermined level, the recognizer detects this and sends a control message to the signal processor 26.
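The confidence-triggered switching can be sketched in a few lines. The names `SwitchState` and `ConfidenceMonitor` and the 0.6 threshold are hypothetical; the specification only says that a predetermined level is used. It also does not say when the system returns to playback, so this sketch deliberately leaves the switch in the sound input state once triggered.

```python
from enum import Enum


class SwitchState(Enum):
    PLAYBACK = "playback"
    SOUND_INPUT = "sound_input"


class ConfidenceMonitor:
    """Hypothetical monitor that requests sound input mode on low confidence."""

    def __init__(self, threshold=0.6):
        # threshold stands in for the "predetermined level"; 0.6 is assumed.
        self.threshold = threshold
        self.state = SwitchState.PLAYBACK

    def on_recognition_score(self, confidence):
        # A score below the threshold triggers the switch to sound input.
        if confidence < self.threshold:
            self.state = SwitchState.SOUND_INPUT
        return self.state
```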

Because the crossbar switch is under software control, by the signal processor 26 and also by the recognizer in some applications, the loudspeakers can be used to acquire useful information about the recognition environment that would not otherwise be available through the single microphone 28. In the environment learning mode, the loudspeakers are individually switched, one at a time, while a predetermined time segment of input sound is sampled and stored for further analysis. By cycling through all of the loudspeakers in this fashion, the system acquires spatial information about the sound field within which the microphone 28 is placed. Acquiring information of the sound field can be quite beneficial in fine tuning the signal processing algorithms used to enhance recognition. For example, if the system needs to recognize a particular person who is speaking among a group of persons, the sound field information can tell where that person is located relative to the others. Once the location has been determined, the utterances of the other persons can be rejected based on spatial cues.

The learning mode described above may be performed at very high speed by utilizing a solid-state crossbar switching circuit. Thus the system can cycle through successive loudspeakers, to acquire sound field information, without the audio content of the playback material being noticeably degraded.
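The one-at-a-time cycling of the learning mode could look roughly like the sketch below. The `capture_segment` callback is hypothetical: it stands in for whatever hardware routine switches one loudspeaker to input, samples the predetermined segment, and switches it back. Using the RMS level per loudspeaker as the spatial cue is also an assumption made for illustration; the specification does not say which analysis is applied to the stored segments.

```python
def learn_sound_field(speaker_ids, capture_segment):
    """Cycle through loudspeakers one at a time and store an RMS level for each.

    capture_segment(speaker_id) is a hypothetical callback returning a list of
    samples picked up while that loudspeaker acts as a sound input transducer.
    """
    levels = {}
    for speaker_id in speaker_ids:
        samples = capture_segment(speaker_id)
        levels[speaker_id] = (sum(x * x for x in samples) / len(samples)) ** 0.5
    return levels


def strongest_pickup(levels):
    # Assumed spatial cue: the loudspeaker with the highest level is nearest
    # to the dominant sound source.
    return max(levels, key=levels.get)
```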

FIG. 2 shows a signal processing algorithm that may be used to enhance recognizer performance. As illustrated, signals from the loudspeakers and microphone, respectively, are converted into the spectrum magnitude domain by processing blocks 50 and 52, respectively. An alignment operation is then performed at 54 and the resulting aligned signal is then subtracted from the spectrum magnitude signal originating from microphone 28. After subtraction as at block 56, the processed signal is then passed to the recognizer 30.
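A minimal sketch of the spectrum magnitude method follows. The specification does not detail the alignment operation at block 54, so this sketch assumes it is a single gain factor estimated by least squares from a noise-only calibration frame; that choice, and all function names, are assumptions. A plain DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def magnitude_spectrum(frame):
    # Plain DFT magnitude for bins 0..n/2 (an FFT would be used in practice).
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def calibrate_gain(mic_noise_mag, ref_noise_mag):
    # Assumed alignment: least-squares gain mapping the loudspeaker (reference)
    # magnitudes onto the microphone magnitudes for a noise-only frame.
    numerator = sum(m * r for m, r in zip(mic_noise_mag, ref_noise_mag))
    denominator = sum(r * r for r in ref_noise_mag)
    return numerator / denominator if denominator else 0.0


def spectral_subtract(mic_mag, ref_mag, gain):
    # Subtract the aligned reference from the microphone spectrum,
    # flooring at zero because magnitudes cannot be negative.
    return [max(m - gain * r, 0.0) for m, r in zip(mic_mag, ref_mag)]
```

With a noise tone that reaches the microphone at twice the level seen by the loudspeaker, the calibrated gain comes out as 2 and the noise bin is removed while the speech bin is left intact.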

Processing in this fashion effectively subtracts the background noise from the speech, so that the speech can be processed more effectively by the recognizer 30. The processing operation is typically calibrated prior to use by allowing the reference microphone to sample only background noise. If the reference microphone receives both speech and noise, then a source separation technique may be used. The source separation technique uses independent component analysis (ICA) to separate the speech and noise. The microphone will have speech and noise, and the loudspeakers being used as sound input transducers will also have speech and noise, but with a different transfer function. In the frequency domain these two input signals can be written according to the matrix equation below: $\begin{bmatrix} m_{1} \\ m_{2} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} s \\ n \end{bmatrix} = M \begin{bmatrix} s \\ n \end{bmatrix}$

In the above matrix equation m₁ and m₂ are the two input signals, while a₁₁, a₁₂, a₂₁ and a₂₂ are transfer functions. The s and n terms are speech and noise, respectively. If the matrix M is not singular, the speech and noise signals can be recovered by: $\begin{bmatrix} s \\ n \end{bmatrix} = M^{-1} \begin{bmatrix} m_{1} \\ m_{2} \end{bmatrix}$
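The recovery step can be checked numerically for a single frequency bin. This sketch assumes the four transfer functions are already known, which is not the case in practice (estimating the inverse is the job of the ICA); the function name `unmix` and the singularity tolerance are hypothetical.

```python
def unmix(m1, m2, a11, a12, a21, a22, tolerance=1e-12):
    """Recover (s, n) from (m1, m2) for one frequency bin via the 2x2 inverse."""
    determinant = a11 * a22 - a12 * a21
    if abs(determinant) < tolerance:
        # M is singular, so speech and noise cannot be separated this way.
        raise ValueError("mixing matrix is singular")
    # Closed-form inverse of a 2x2 matrix applied to the input vector.
    s = (a22 * m1 - a12 * m2) / determinant
    n = (-a21 * m1 + a11 * m2) / determinant
    return s, n
```

Complex values are used because frequency-domain signals and transfer functions generally carry both magnitude and phase.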

The independent component analysis will find the inverse of M, using a gradient descent algorithm. The recovered speech is then fed to the speech recognizer 30. If applied directly to the sound signal, ICA can take a considerable amount of computational power. This power can be substantially reduced if the signal is split into frequency bands and ICA is applied, band by band. The frequency band representation may be sent directly to the recognizer.

Referring now to FIG. 3, an exemplary application of the improved recognizer is illustrated. Many of the component parts illustrated have been described above and will not be repeated here. As illustrated in FIG. 3, the recognizer is coupled to a control system 60, which may, in turn, be coupled to the audio system 10 (or audio/video system). The control system can also be connected to other devices for which operational control is desired. The control system may be provided with a memory for storing a feature set database illustrated diagrammatically at 62. The feature set database stores the identity of various devices and operational features and functions of those devices in association with the identity of different persons who will be using the system. The feature set database is used to dictate which of the various control functions certain individual persons have authority to operate. In an automotive application, for example, certain vehicular functions may be designated for the vehicle driver only. Using information about the spatial location of the vehicle occupants, the system is able to ascertain whether the driver or one of the passengers has uttered a particular voice command. If the command is one that is reserved for the driver only, it will be performed only if the driver utters it. The command will be ignored if other vehicle occupants utter it.
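The role of the feature set database can be illustrated with a simple lookup. The command names, the occupant labels and the dictionary layout below are hypothetical examples, not contents of the database described in the specification.

```python
# Hypothetical feature set: each command maps to the occupants allowed to use it.
FEATURE_SET = {
    "adjust_mirrors": {"driver"},
    "change_radio_station": {"driver", "front_passenger", "rear_passenger"},
}


def is_permitted(command, speaker_position, feature_set=FEATURE_SET):
    """Return True only if the located speaker may operate the command."""
    # Unknown commands are treated as permitted for nobody.
    return speaker_position in feature_set.get(command, set())


def handle_command(command, speaker_position, feature_set=FEATURE_SET):
    # A command from an unauthorized occupant is ignored rather than executed.
    return "execute" if is_permitted(command, speaker_position, feature_set) else "ignore"
```

Here `speaker_position` is assumed to come from the spatial localization step, so a driver-only command uttered from a passenger seat yields `"ignore"`.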

While it should be apparent that the recognition system of the invention can be used in a variety of different applications, two examples of such systems will be provided next in order to illustrate some of the ways that the invention may be deployed.

Referring to FIG. 4, a vehicle cockpit is shown at 80. The vehicle audio system is shown at 82, with two of the audio system loudspeakers illustrated at 84 and 86. Other loudspeakers would be also provided in other locations of the vehicle (not shown). A microphone 28 is provided in a suitable location, such as within the rearview mirror assembly. If desired, the rearview mirror assembly may also have a system activation button 88 that the user presses to turn on the recognition system. Such button is optional, as the recognition system can be configured to work automatically, as described previously. The recognizer of the invention can be housed in a suitable package having connectors for plugging between the audio system 82 and the loudspeakers. The package is designed to accept the standard wiring harness plugs and jacks found within the vehicle. This has an advantage in that the wiring harness and loudspeaker installation may be the same for vehicles with speech recognition and without it. This saves on manufacturing and inventory costs.

FIG. 5 illustrates a home entertainment system with the recognition system employed. In the home entertainment system, the microphone 28 may be placed in a suitable location, such as at a fixed location within the viewing room, or within the remote control of one of the components. When placed in the remote control, wireless or infrared communication may be used to communicate the spoken utterance back to the signal processing unit.

In some implementations, it may be beneficial to provide the recognizer with different acoustic models for different noise conditions. The recognizer system of the invention makes provision for this using the ambient noise measuring and acoustic model selection system illustrated in FIG. 6. The system maintains a pool of acoustic models, stored in acoustic model memory 100. An intelligent decision logic unit 102 predicts or determines the current noise conditions based on a variety of factors that are supplied as inputs to the logic unit 102, as illustrated. The logic unit supplies an ambient noise identification signal at 104 to an acoustic model selection module 106. The selection module 106 selects the appropriate acoustic model from memory 100, based on the signal at 104, and supplies this model to the model adaptation module 108. Model selections can be made prior to and/or during the recognition session. Module 108, in turn, generates or supplies the adapted model to the pattern matching engine 110 of the recognizer. The intelligent decision logic unit 102 may also be configured to supply a control signal at 112 to provide background noise information to the model adaptation module 108.
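The selection step performed by module 106 can be pictured as a keyed lookup into the model pool. The noise identifiers, the placeholder model names and the fallback behavior in this sketch are assumptions; the specification does not say what happens when no model matches the identified noise condition.

```python
# Hypothetical pool: noise identifier -> acoustic model (placeholder strings here).
MODEL_POOL = {
    "quiet": "model_quiet",
    "city": "model_city",
    "highway": "model_highway",
}


def select_acoustic_model(noise_id, model_pool=MODEL_POOL, fallback_id="quiet"):
    """Return (identifier, model) for the identified ambient noise condition."""
    if noise_id in model_pool:
        return noise_id, model_pool[noise_id]
    # Assumed policy: fall back to a default model for unknown conditions.
    return fallback_id, model_pool[fallback_id]
```

Because selection can happen before or during a session, a function of this kind could simply be called again whenever the noise identification signal changes.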

In addition to providing adapted acoustic models for recognition, the system may also be configured to perform noise compensation upon the input speech signal prior to recognition and/or to change compensation parameters during a recognition session. As illustrated in FIG. 6, an ambient noise identification signal is supplied at 114 to a noise compensation module 116. Signal 114 provides the noise compensation module with information about the type of noise in the current ambient background. The noise compensation module performs processing of the input speech signal to remove or reduce the effects of the noise. In a presently preferred embodiment, noise compensation is performed in a parametric domain, after the input speech signal has been processed by the feature extraction module 118, as illustrated.

The front-end (noise compensation) processing operations can be selected according to current noise conditions. If the noise is minimal, then perceptual linear prediction features can be selected for recognition. If the noise is greater, then a sub-band feature can be selected for recognition. If the noise is null, Mel frequency cepstral coefficient features may be selected.
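The three-way choice of front-end features can be written as a small decision function. The numeric noise scale and the 0.2 boundary between "minimal" and "greater" noise are assumptions introduced for illustration; the specification gives only the qualitative categories.

```python
def select_front_end(noise_level, minimal_threshold=0.2):
    """Pick a feature type from a noise estimate on an assumed 0.0 to 1.0 scale."""
    if noise_level <= 0.0:
        # Null noise: Mel frequency cepstral coefficient features.
        return "mfcc"
    if noise_level <= minimal_threshold:
        # Minimal noise: perceptual linear prediction features.
        return "plp"
    # Greater noise: sub-band features.
    return "sub_band"
```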

While there can be a wide assortment of different factors that affect what noise is present in the ambient background, the following are offered as some examples. Suitable sensors would be provided to capture the following operating parameters:

-   Engine is on or off.
-   Speed of the vehicle, e.g., 30 mph (residential), 40 mph (city), 65 mph (highway).
-   Accelerator position (the speed will be lower if the vehicle is climbing a mountain, but the accelerator will be more fully depressed and engine noise will be greater).
-   Engine rpm.
-   Age of the vehicle.
-   Model of the vehicle (sports car, family sedan, minivan, SUV, motor home, school bus, etc.). This information can also serve to inform the logic unit of the number of speakers that can be estimated a priori.
-   Window open or closed.
-   Sensors under vehicle seat(s) or at the entrance of each door, so the system can precisely estimate the number of persons inside the vehicle. (Pets, like cats and dogs, can be recognized similarly, and the system will detect that these occupants will not be providing speech input to be recognized.)
-   Windshield wipers on or off.
-   Convertible top up or down/sunroof open or closed.
-   Radio, music, DVD on or off.
-   Global Positioning System (GPS): vehicle location. This information is used to learn street location. The type of roadway surface can be stored for each location and this information used to predict noise level. In this regard, a concrete roadway provides a different background noise than a blacktop, gravel or dirt surface. In addition, the background noise associated with each type of surface changes differently when wet. GPS information may also be used to determine whether the vehicle is approaching a train track (railway crossing) or moving near the ocean (surf noise) or climbing up a mountain.
-   Real-time weather and traffic conditions.
-   Air conditioning system is on or off.
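A few of the operating parameters listed above can be gathered into a record and mapped to a noise condition with simple rules. This is only a sketch of how a decision logic unit might be prototyped: the `OperatingConditions` fields cover a small subset of the sensors, and the rule order and the noise labels are hypothetical. The 40 mph and 65 mph boundaries merely reuse the example speeds given in the list.

```python
from dataclasses import dataclass


@dataclass
class OperatingConditions:
    """Subset of the listed sensor inputs (field names are assumptions)."""
    engine_on: bool = False
    speed_mph: float = 0.0
    window_open: bool = False
    wipers_on: bool = False


def identify_noise(conditions):
    """Map sensor readings to a hypothetical ambient noise label."""
    if not conditions.engine_on:
        return "quiet"
    if conditions.window_open and conditions.speed_mph >= 40:
        return "wind"
    if conditions.wipers_on:
        return "rain"
    if conditions.speed_mph >= 65:
        return "highway"
    return "city"
```

A deployed system would presumably weigh many more inputs (GPS, road surface, weather), but the shape of the mapping from sensor readings to a noise identifier stays the same.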

The acoustic models stored in memory 100 can be preconfigured and downloaded through a suitable data connection, such as at a vehicle service center or via a vehicle internet connection. Alternatively, the system can measure background noise, using the sound transducers as described herein, and then generate its own acoustic models. Because the system generates the models itself, the models can adapt as the vehicle ages. As the vehicle ages, its noise characteristics change (more rattles, louder muffler, etc.).

The acoustic models may be trained according to most of the noisy conditions and the best fitting model is selected according to the deterministic information from all of the sensors (described above). Model adaptation can also be done on the selected model to enhance the inter-speaker and intra-speaker variabilities. FIG. 6 thus illustrates a model training module 120 that provides new models, or re-trains existing models within memory 100. The model training module receives information about ambient noise conditions from the loudspeaker system 122, and/or from microphone 124.

The description of the invention is merely exemplary in nature and, thus, variations that do not depart from the gist of the invention are intended to be within the scope of the invention. Such variations are not to be regarded as a departure from the spirit and scope of the invention.

1. A signal processing system to improve speech recognition, comprising: a loudspeaker system for playback of audio content; a sound transducer for input of speech utterances from a recognition system user; a multi-channel signal processing system configured to receive speech utterance input from said sound transducer and from at least one additional audio source, the signal processing system being operative to enhance the recognition quality of the received speech utterance; a switching system coupled to said signal processing system and to said loudspeaker system, the switching system being operative to selectively couple said loudspeaker system to said signal processing system thereby utilizing said loudspeaker system as said additional audio source.

2. The system of claim 1 wherein said signal processing system is configured to increase discrimination between speech utterances produced by one source from sounds produced by other sources.

3. The system of claim 1 wherein said signal processing system is configured to perform source separation.

4. The system of claim 1 wherein said signal processing system is configured to perform source separation using independent component analysis to separate speech from noise.

5. The system of claim 4 wherein said independent component analysis is performed in the frequency domain.

6. The system of claim 1 further comprising a recognizer coupled to said signal processor.

7. The system of claim 6 further comprising a control system coupled to said recognizer and operative to control a function of an electronic component.

8. The system of claim 7 wherein said electronic component is an audio system or audio/video system coupled to said loudspeaker system.

9. The system of claim 7 wherein said control system includes a memory for storing first data regarding electronic component function in association with second data indicative of at least one recognition system user.

10. The system of claim 9 wherein said control system uses said first and second data to selectively permit control of a function of an electronic component.

11. The system of claim 1 wherein said signal processing system controls said switching system.

12. The system of claim 6 wherein said recognizer controls said switching system.

13. A method of improving speech recognition, comprising: selectively interrupting playback of audio content through a loudspeaker system while concurrently utilizing said loudspeaker system as a sound input transducer; receiving an input speech utterance from a recognition system user through a plurality of transducers that include said loudspeaker system; and processing said received input speech utterance to enhance recognition quality.

14. The method of claim 13 wherein said step of selectively interrupting playback is performed automatically.

15. The method of claim 13 wherein one of said plurality of transducers is a microphone and wherein said step of selectively interrupting playback is performed in response to receipt of a speech utterance via said microphone.

16. The method of claim 13 wherein one of said plurality of transducers is a microphone and wherein said step of selectively interrupting playback is performed in response to speech recognition performed upon a speech utterance received via said microphone.

17. The method of claim 13 further comprising controlling at least one function of an electronic component based on the results of said step of performing recognition.

18. The method of claim 13 further comprising processing said received input speech utterance by performing source separation using separate signals input respectively through said loudspeaker system and through a microphone.

19. The method of claim 18 wherein said source separation is performed using independent component analysis to separate speech and noise.

20. The method of claim 13 further comprising using separate signals input respectively through said loudspeaker system and through a microphone to ascertain spatial information from the input speech utterance.

21. The method of claim 20 further comprising using said spatial information to discriminate between a plurality of concurrent recognition system users.
 22. Themethod of claim 20 further comprising controlling at least one functionof an electronic component based on the results of said step ofperforming recognition and further based on said spatial information.23. A speech recognition system for incorporation into an automotivevehicle having an audio system with an associated loudspeaker system,comprising: a sound transducer for input of speech utterances from arecognition system user; a multi-channel signal processing systemconfigured to receive speech utterance input from said microphone andfrom at least one additional audio source, the signal processing systembeing operative to enhance the recognition quality of the receivedspeech utterance; a switching system coupled to said signal processingsystem and to said loudspeaker system, the switching system beingoperative to selectively couple said loudspeaker system to said signalprocessing system thereby utilizing said loudspeaker system as saidadditional audio source.
 24. A speech recognition system for integrating into a home entertainment system having a loudspeaker system for playback of audio content, comprising: a sound transducer for input of speech utterances from a recognition system user; a multi-channel signal processing system configured to receive speech utterance input from said microphone and from at least one additional audio source, the signal processing system being operative to enhance the recognition quality of the received speech utterance; a switching system coupled to said signal processing system and to said loudspeaker system, the switching system being operative to selectively couple said loudspeaker system to said signal processing system thereby utilizing said loudspeaker system as said additional audio source.
 25. A signal processing system to improve speech recognition, comprising: a loudspeaker system for playback of audio content; a microphone for input of speech utterances from a recognition system user; a multi-channel signal processing system configured to receive speech utterance input from said sound transducer and from at least one additional audio source, the signal processing system being operative to enhance the recognition quality of the received speech utterance; a switching system coupled to said signal processing system and to said loudspeaker system, the switching system being operative to selectively couple said loudspeaker system to said signal processing system thereby utilizing said loudspeaker system as said additional audio source.
 26. A speech recognition system, comprising: a loudspeaker system for playback of audio content; a sound transducer for input of speech utterances from a recognition system user; a recognizer configured to receive speech utterance input from said sound transducer and configured to use at least one acoustic model in the recognition of said speech utterance input; an acoustic model selection system coupled to said recognition system and being operative to selectively control the acoustic model used by said recognition system based on at least one environment parameter; said acoustic model selection system further using said loudspeaker system to assist in determining the acoustic model used by said recognition system.
 27. The recognition system of claim 26 wherein said sound transducer is a microphone.
 28. The recognition system of claim 26 wherein said sound transducer is a loudspeaker coupled to receive audio input.
 29. The recognition system of claim 26 wherein said acoustic model selection system includes a pool of acoustic models and operates to select a model from said pool for use by the recognizer, based on at least one environment parameter.
 30. The recognition system of claim 26 wherein said loudspeaker system is coupled to receive audio input from the environment and the recognition system further includes acoustic model training module to provide at least one acoustic model for use by the recognizer.
 31. The recognition system of claim 26 wherein said loudspeaker system is coupled to receive audio input from the environment and the recognition system further includes acoustic model training module to provide a plurality of acoustic models for use by the recognizer for different background noise conditions.
 32. The recognition system of claim 26 further comprising a decision logic unit coupled to said model selection system that ascertains at least one vehicle operating parameter and causes the model selection system to change at least one aspect of the acoustic model used by the recognizer based on said operating parameter.
 33. The recognition system of claim 26 further comprising a decision logic unit coupled to said model selection system that ascertains at least one vehicle environmental characteristic and causes the model selection system to change at least one aspect of the acoustic model used by the recognizer based on said environmental characteristic.
 34. The recognition system of claim 26 further comprising a noise compensation module for performing signal processing prior to recognition, the noise compensation module being configured to selectively perform said signal processing based on said at least one environmental parameter.
 35. The recognition system of claim 34 further comprising feature extraction module that processes said speech utterance input and wherein said noise compensation module operates upon a speech information signal derived from said feature extraction module.
 36. A speech recognition system, comprising: a recognizer configured to receive speech utterance input from a sound transducer and configured to use at least one acoustic model in the recognition of said speech utterance input; an acoustic model selection system coupled to said recognition system and being operative to selectively control the acoustic model used by said recognition system based on at least one sensed operating parameter; a decision logic unit coupled to said model selection system that ascertains at least one vehicle operating parameter and causes the model selection system to change at least one aspect of the acoustic model used by the recognizer based on said operating parameter.
 37. The speech recognition system of claim 36, wherein: said decision logic unit further ascertains vehicle location and causes the model selection system to change at least one aspect of the acoustic model used by the recognizer based on said location.
 38. The speech recognition system of claim 36 wherein said acoustic model selection system is configured to selectively control the acoustic model used by the recognition system during an ongoing recognition session.
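
By way of illustration only, and without limiting the claims, the model selection recited above (a pool of acoustic models, a decision logic unit acting on vehicle operating parameters, and switching during an ongoing session) might be sketched as follows. The parameter names (`speed_kmh`, `windows_open`), the thresholds, the model identifiers, and the `Recognizer` stand-in are all hypothetical and are not taken from this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingParameters:
    """Hypothetical vehicle operating parameters (assumed for this sketch)."""
    speed_kmh: float
    windows_open: bool


# Hypothetical pool of acoustic models, keyed by background noise condition.
MODEL_POOL = {
    "quiet": "acoustic_model_quiet",
    "road_noise": "acoustic_model_road_noise",
    "wind_noise": "acoustic_model_wind_noise",
}


def select_acoustic_model(params: OperatingParameters) -> str:
    """Decision logic: map sensed parameters to one model from the pool."""
    # Thresholds below are illustrative assumptions, not values from the source.
    if params.windows_open and params.speed_kmh > 30.0:
        return MODEL_POOL["wind_noise"]
    if params.speed_kmh > 60.0:
        return MODEL_POOL["road_noise"]
    return MODEL_POOL["quiet"]


class Recognizer:
    """Stand-in recognizer that only tracks which acoustic model is active."""

    def __init__(self) -> None:
        self.active_model = MODEL_POOL["quiet"]

    def update_model(self, params: OperatingParameters) -> bool:
        """Re-run selection during a session; return True if the model changed."""
        chosen = select_acoustic_model(params)
        changed = chosen != self.active_model
        self.active_model = chosen
        return changed
```

In such a sketch, `update_model` could be called whenever a sensed parameter changes, so that the active model tracks conditions without restarting the recognition session.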